Comprehensive Comparison of Gene Set Analysis Tools

Authors

  • Zheng Liu
  • Xuejun Li
  • Yate-Ching Yuan
  • Xiwei Wu
Abstract

Gene set analysis has brought biological insight to microarray data analysis. Over-representation analysis (ORA), the first method introduced and still widely used, is limited by its need for a predetermined list of differentially expressed genes. To overcome this limitation, distribution-based analysis (DBA) methods were developed, which differ in their analysis steps and null hypotheses. To understand the advantages and limitations of these methods, we present a comprehensive survey and evaluate the performance of nine commonly used gene set analysis tools. Methods that test a self-contained hypothesis generally have better sensitivity and specificity than methods that test a competitive hypothesis. However, most of the methods are biased towards larger gene sets, and the bias is more severe in self-contained methods. Self-contained methods therefore gain sensitivity and specificity at the cost of greater bias, and competitive methods show the reverse trade-off. We propose a combined performance plot to compare these methods; in this comparison, GSA performed better than the others.
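To make the ORA limitation concrete, here is a minimal sketch of the over-representation test as it is commonly formulated: a one-sided hypergeometric test on the overlap between a gene set and a list of differentially expressed genes. This is an illustration only, not code from the paper or from any of the nine tools evaluated. The function name `ora_p_value` and the toy gene identifiers are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_de, set_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    n_universe: number of genes measured on the array
    n_de:       number of genes called differentially expressed
    set_size:   number of genes in the gene set
    overlap:    number of gene-set members that are also in the DE list
    """
    max_overlap = min(n_de, set_size)
    total = comb(n_universe, n_de)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(set_size, k) * comb(n_universe - set_size, n_de - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy data (hypothetical): 20 measured genes, 5 called DE, a 4-gene set.
universe = {f"g{i}" for i in range(1, 21)}
de_genes = {"g1", "g2", "g3", "g4", "g5"}
gene_set = {"g1", "g2", "g3", "g10"}

overlap = len(de_genes & gene_set)
p = ora_p_value(len(universe), len(de_genes), len(gene_set), overlap)
print(f"overlap={overlap}, p={p:.4f}")
```

Note that `de_genes` must be fixed before the test can run. That is the predetermined list the abstract refers to, and a different expression cutoff would give a different list and a different p-value.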





Publication date: 2011